-
Notifications
You must be signed in to change notification settings - Fork 198
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
container-encapsulate: make build_mapping_recurse significantly faster #4768
Conversation
Hi @tymlipari. Thanks for your PR. I'm waiting for a coreos member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
One note: I said "effectively" equivalent to call out a small delta I noticed when doing a test compose of a Silverblue 39 repo. Every file digest (I dumped
These files only have |
Thanks @tymlipari ! I agree this is an annoying paper cut. I only skimmed your code but I find myself wondering if we are missing some sort of potential much simpler change here to optimize things. As I understand things, the rpm database is providing an index for this...is there some sort of flag we need to pass? One historical major foot-gun in lib rpm as I understand it is sometimes it does signature verification by default and one needs to do e.g. |
@cgwalters I'm not sure there is, at least not without needing to expose new functionality from within rpm. In the RPM source code, the only place I see it build a full fingerprint cache of its own for the full package set is within So then I started digging through RPM to understand the existing implementation ( Looking at how RPM handles The current implementation uses the So an alternative is essentially what I've coded up here - take essentially the raw database contents and cache them locally to speed up the lookup. (ninja edit: This results in essentially a single database query, but means we have to hold more memory on our end). One caveat though is that it's still necessary to resolve the path against the OSTree repo fs contents, because several RPMs have paths in the database that don't match their real filesystem paths (e.g. Other approaches that could work, but would require I think either more buy-in from RPM or more risk from rpm-ostree:
|
Hmm...can we get the same effect here by forking off the target tree as a container and running librpm from inside there, talking to it over IPC? |
I believe that would only fix the issue of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you've obviously put some effort into this and it is clearly solving a real problem. To be honest I was not happy with the previous code at all. But, I also am not finding myself thinking that this new approach is an unambiguous improvement...the chance for new bugs seems somewhat high.
I'd really like to be able to efficiently backreference files from librpm without us carrying a new cache.
Again I'm not opposed to just merging this but let's discuss some more |
(To pass CI you need to To be very honest if you're likely to make more contributions later, I'm inclined to click commit and we can keep improving this more. |
time to build an Onyx ostree container (in a mock chroot on my laptop) with this patch:
without it:
|
To keep conversation in one place, moving this here:
It's not quite "backporting" if a PR isn't merged 😄 Well...yeah. I still have the feeling we should be able to figure out how to optimize the rpmdb lookup without a whole other mapping but...yes, it's unlikely I (or anyone else) will have bandwith in the near future to dive into this more, so...in the interest of forward progress I will take a look at just doing some cleanups on this and merging. |
@tymlipari can you allow me to force-push to your branch? There should be a checkbox on this PR |
src/libpriv/rpmostree-strcache.h
Outdated
@@ -0,0 +1,43 @@ | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs an SPDX header here
@@ -6094,16 +6124,6 @@ cache_branch_to_nevra (::rust::Str nevra) noexcept | |||
rpmostreecxx$cxxbridge1$cache_branch_to_nevra (nevra, &return$.value); | |||
return ::std::move (return$.value); | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason this code is moved? Maybe split this change into another commit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was a result of me running ./ci/verify-cxx.sh
, I can revert if you'd like though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, then maybe let's split this change into its own commit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is minor so optional.
Apologies, I hadn't been able to get around to this as quickly as I hoped. I've updated the PR to move all of the logic behind this into the Rust side of the codebase. Now, rather than walking the filesystem and asking rpm which package provides it, we walk each package's file set and look it up in the ostree filesystem. At runtime rather than directly building the content map, it builds two maps: a) Overall, this is essentially the same algorithm as the previous iteration of this PR, but a bit more cleanly integrated with the existing codebase. |
rust/src/fsutil.rs
Outdated
} | ||
|
||
// Resolve our parent, then resolve ourselves as a direct child | ||
if let Some(parent) = resolve_ostree_paths(path.parent().unwrap(), fsroot, cache) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor style nit, to avoid deep nesting below, we can do
let parent = if let Some(parent) = parent {
parent
} else {
return Ok(None);
}
It feels weird to mention the variable 4 times, but it does keep the code simpler in the end.
rust/src/fsutil.rs
Outdated
let link_target = child_info.symlink_target().unwrap(); | ||
|
||
// Due to a bug in OSTree's Gio.File implementation, we cannot | ||
// just do `parent.resolve_relative_path` here as it doesn't correctly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eek.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this was a frustrating find. I couldn't figure out why some paths weren't matching, and then after adding a bunch of debugging statements found it was producing paths like /usr/lib/something/../../usr/bin/something_else
rather than resolving the ..
paths along the way. The documentation for Gio.File says that resolve_relative_path
should always produce an absolute path, which I guess it technically is here, but IMO it'd be more correct to resolve the real path. Especially because when you call query_info
on the resulting file, it doesn't actually resolve the real file info (I was getting FileType::Unknown
).
I'll try and see if I can cobble together a minified repro and file an issue over on the ostree repo. This logic I added does appear to handle these kinds of paths for us just fine when testing a Silverblue build, but still annoying to have to handle.
/ok-to-test |
Why do you have a revert commit there in the middle? Can you squash the commits please? |
20762ff
to
12d4a7d
Compare
Ah, I usually like to preserve the full history of a PR and squash when it's landed, but I can go ahead and squash it now instead. |
Thanks a lot for working on this. This looks much more readable in Rust :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
BTW for future changes can you include all the content from the PR description in commit messages? We try to have Thanks again! |
While toying around with building my own custom FCOS builds, I noticed that running
cosa build container
with a package set similar to Silverblue's resulted in ~2hr builds, the vast majority of which was in the "Building package mapping" task. After this change, the runtime on my build shrank to ~15 mins.$ time cosa build container
Before
After
The speedup is accomplished by avoiding the need to query the rpmdb for every file. Instead the rpmdb is walked to build a cache of the files to providing packages, so that when the ostree filesystem is walked later it can just check the cache. The cache is structured similarly to rpm's internals, where paths are maintained as separate basename and dirname entries. Additionally, like rpm, the paths are considered equivalent if the dirnames resolve to the same path (rpm uses
stat
to compare inodes, this implementation resolves the symlinks). This results in output that is effectively equivalent to the previous implementation while being substantially faster.To minimize memory overhead maintaining the file mapping, a simple string cache is also added.
Closes: #4880